The Vibecoder's Guide to AI-trix

A terminal-styled checklist and template for building production-ready agentic systems.

This guide is the narrative spine that walks developers from "what is an agent" to running multi-agent workflows in production with strict budget, safety, and reliability guardrails. It replaces long, fragile prompts with a versioned, self-correcting agent contract.

Open source template: github.com
Type: Narrative Spine & On-ramp Template
Runtime: Agent Framework Independent; Ollama Offline Fallback
Status: Template Available; Production Proven
Primary Focus: Breaking context loops, cost gating, and schema validation

Introduction: Reversing the Docs to Code Ratio

Prompting is fragile. A long prompt gets edited, trimmed, and "improved" over time. After a few iterations, the agent drifts and no longer follows its original constraints. AI-trix is a structured contract that separates system rules from instructions so that quality, security, and operational durability survive those edits.

This guide is the structural spine of my portfolio projects. For each production concern, it points directly to the matching open-source repository as runnable, cloneable proof.

Part A: System Identity & Core Rules

A production-grade agent does not rely on soft prompts. It follows a strict identity rule set.

Part A: Core Rules
1. Reality-First
   - Truth lives in memory/reality/*.yaml, not in the docs.
   - If docs and code conflict, trust the reality directory.
   - Before proposing changes, check current-state.md.
2. Memory Routing
   - Short-term: Local state store (Redis/Zustand/Blackboard).
   - Operational: current-state.md and memory/reality/*.yaml.
   - Long-term: SQLite/persistent store for solution recall.
   - Hallucination guard: If a feature is not in reality/*.yaml, assume it does not exist.

The Proof: Persistent recall is implemented in Agent-Recall (solution memory via token overlap) and failure memory is managed by Agent-Scars.
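To make "solution memory via token overlap" concrete, here is a minimal sketch that assumes recall means ranking stored solutions by Jaccard similarity over word tokens. The function names, the `task`/`solution` keys, and the 0.3 threshold are all hypothetical; Agent-Recall may score overlap differently.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, candidate: str) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    q, c = tokenize(query), tokenize(candidate)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def recall(query: str, memory: list[dict], threshold: float = 0.3) -> list[dict]:
    """Return stored entries whose task text overlaps the query, best first."""
    scored = [(overlap_score(query, entry["task"]), entry) for entry in memory]
    hits = [pair for pair in scored if pair[0] >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in hits]


# Hypothetical stored solutions, for illustration only.
memory = [
    {"task": "fix ESM import in CommonJS module", "solution": "use require()"},
    {"task": "add redis cache invalidation", "solution": "publish invalidate event"},
]
print(recall("ESM import fails in CommonJS", memory))
```

Token overlap is cheap and works offline, but it cannot match paraphrases that share no words, which is the trade-off against embedding-based recall.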

Part B: Agent Design Template

Every agent inside a multi-agent swarm must follow a strict input-process-output contract. Passing unstructured strings between agents leaves nothing to validate, so errors propagate silently from one agent to the next.

Part B: Agent Contract
### Input Structure
- taskDef: {goal: string, context: string, constraints: string[]}
- state: Current central state object
- solutions: Prior similar solutions from solutionMemory

### Output Schema (Always JSON)
{
  "success": true,
  "output": "structured_result",
  "confidence": 0.95,
  "cost_usd": 0.0012,
  "latency_ms": 1200,
  "reasoning": "rationale here",
  "error": null
}

The Proof: This modular anatomy of an agent is dissected and demonstrated in Agent-Anatomy.
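One way to enforce the output schema is to parse every agent response into a typed record and reject anything that does not match. The sketch below is an assumption about how such a check could look, not the Agent-Anatomy implementation; `AgentResult` and `parse_agent_result` are hypothetical names, and rejecting unexpected fields is a design choice made here for strictness.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class AgentResult:
    """Typed mirror of the output schema listed in the agent contract."""
    success: bool
    output: Any
    confidence: float
    cost_usd: float
    latency_ms: int
    reasoning: str
    error: Optional[str]


def parse_agent_result(raw: str) -> AgentResult:
    """Parse a raw agent response; raise ValueError on any schema violation."""
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("agent output must be a JSON object")
    expected = {f.name for f in fields(AgentResult)}
    if set(data) != expected:
        missing = sorted(expected - set(data))
        extra = sorted(set(data) - expected)
        raise ValueError(f"schema mismatch: missing={missing}, unexpected={extra}")
    result = AgentResult(**data)
    if not isinstance(result.success, bool):
        raise ValueError("success must be a boolean")
    if isinstance(result.confidence, bool) or not isinstance(result.confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return result


raw = (
    '{"success": true, "output": "structured_result", "confidence": 0.95, '
    '"cost_usd": 0.0012, "latency_ms": 1200, "reasoning": "rationale here", '
    '"error": null}'
)
print(parse_agent_result(raw))
```

A downstream agent can then consume `AgentResult` fields directly instead of re-parsing free text.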

Part C: Reality Files & Anti-Drift

When agents modify code, the code drifts from the documentation. Reality files act as a YAML-based ledger detailing exactly what is currently operational, experimental, or still stubbed.

Part C: Reality Structure
# reality: database_subsystem
subsystem: db_connector
last_updated: 2026-06-06
status: operational
working:
  connection_pooling:
    reality: "Acquires connections from neon pool with 10s timeout"
    confidence: 0.98
stubs_that_look_real:
  cache_invalidation:
    reality: "Returns static true; does not invalidate redis cache"
    risk: "Operators assume cache updates instantly"
    mitigation: "Check logs for bypass flag"

The Proof: Using a separate constitution to audit agent outputs and detect drift against original rules is demonstrated in Agent-Constitution, while the runtime anti-drift prompt guarding is handled by Agent-Scars.
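The hallucination guard from the core rules can be sketched as a lookup against a loaded reality file. This example assumes the YAML has already been parsed into a dictionary (parsing would need a YAML library, which is left out here), and `feature_status` is a hypothetical helper rather than part of any listed repository.

```python
# Dictionary mirroring the reality file for the database subsystem.
reality = {
    "subsystem": "db_connector",
    "status": "operational",
    "working": {
        "connection_pooling": {
            "reality": "Acquires connections from neon pool with 10s timeout",
            "confidence": 0.98,
        },
    },
    "stubs_that_look_real": {
        "cache_invalidation": {
            "reality": "Returns static true; does not invalidate redis cache",
            "risk": "Operators assume cache updates instantly",
            "mitigation": "Check logs for bypass flag",
        },
    },
}


def feature_status(reality: dict, feature: str) -> str:
    """Classify a feature as 'working', 'stub', or 'absent'.

    Anything not listed in the reality file is treated as absent,
    which is the hallucination guard: unlisted means it does not exist.
    """
    if feature in reality.get("working", {}):
        return "working"
    if feature in reality.get("stubs_that_look_real", {}):
        return "stub"
    return "absent"


for name in ("connection_pooling", "cache_invalidation", "query_sharding"):
    print(name, "->", feature_status(reality, name))
```

An agent that is about to rely on `cache_invalidation` would see `stub` and could surface the recorded risk instead of assuming the cache is updated.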

Part D: Scars & Incident Logging

When an agent makes a mistake, it must be recorded as a "scar." If the same error pattern is detected twice, a repeat guard block is immediately prepended to the system prompt to force self-correction.

Part D: Incident Scarring
2026-06-06 | ESM Import in CommonJS | src/db.js | Use require() instead of import | Never mix modules

The Proof: This pattern is fully implemented in Agent-Scars using a local SQLite instance with an automatic JSON fallback.
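The repeat guard can be illustrated with an in-memory sketch. It assumes the pipe-separated scar line carries five fields (date, error pattern, location, fix, rule); those field names and the wording of the guard block are hypothetical, and the SQLite and JSON storage used by Agent-Scars is left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Scar:
    date: str
    pattern: str   # assumed field name for the error pattern
    location: str  # assumed field name for the affected file
    fix: str
    rule: str


def parse_scar(line: str) -> Scar:
    """Split one pipe-separated scar line into its five fields."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return Scar(*parts)


def build_system_prompt(base_prompt: str, scars: list[Scar]) -> str:
    """Prepend a repeat guard for every error pattern recorded at least twice."""
    counts = Counter(scar.pattern for scar in scars)
    repeated = {}
    for scar in scars:
        if counts[scar.pattern] >= 2:
            repeated[scar.pattern] = scar  # keep the latest entry per pattern
    if not repeated:
        return base_prompt
    lines = ["REPEAT GUARD: these mistakes have occurred more than once."]
    for scar in repeated.values():
        lines.append(f"- {scar.pattern}: {scar.fix}. Rule: {scar.rule}.")
    return "\n".join(lines) + "\n\n" + base_prompt


line = "2026-06-06 | ESM Import in CommonJS | src/db.js | Use require() instead of import | Never mix modules"
scar = parse_scar(line)
print(build_system_prompt("You are the Reviewer agent.", [scar, scar]))
```

With a single recorded scar the prompt is returned unchanged; the guard only appears once the same pattern shows up a second time.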

Part E: current-state.md

The `current-state.md` file is the master reference for human operators and AI agents alike. It tracks the current development phase, verified working features, partially completed stubs, and the top operational risks. It prevents agents from rebuilding working systems or claiming incomplete stubs as fully operational.

Part F: Security DNA Checklist

Multi-agent systems require rigorous security guardrails at the prompt and runtime level to prevent command injection and unauthorized data modification.

Security DNA Interactive Checklist

The Proof: These defense patterns, including prompt-injection filters, are described in Agentic Patterns and validated in Agent-Routing.

Part G: Observability

Every agent action must emit a structured event to a central stream. This allows the system to monitor token burn, API latency, and confidence distributions.

Part G: Telemetry Schema
{
  "timestamp": "2026-06-06T15:45:00Z",
  "agent": "Reviewer",
  "taskId": "task-code-audit",
  "action": "llm_call",
  "status": "success",
  "metadata": {
    "cost_usd": 0.0034,
    "latency_ms": 950,
    "tokens_used": 2800,
    "confidence": 0.98
  }
}

The Proof: This telemetry structure runs continuously inside the core loop of AgentKernel.
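A minimal sketch of emitting events in this shape follows. It assumes the "central stream" can be modelled as any writable text stream receiving one JSON object per line; `build_event` and `emit` are hypothetical names, not the AgentKernel API.

```python
import io
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def build_event(agent: str, task_id: str, action: str, status: str, *,
                cost_usd: float, latency_ms: int, tokens_used: int,
                confidence: float, now: Optional[datetime] = None) -> dict:
    """Build one telemetry event matching the schema above.

    `now` must be timezone-aware if given; it defaults to the current UTC time.
    """
    moment = now or datetime.now(timezone.utc)
    timestamp = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "timestamp": timestamp,
        "agent": agent,
        "taskId": task_id,
        "action": action,
        "status": status,
        "metadata": {
            "cost_usd": cost_usd,
            "latency_ms": latency_ms,
            "tokens_used": tokens_used,
            "confidence": confidence,
        },
    }


def emit(event: dict, stream: TextIO) -> None:
    """Append the event to the stream as a single JSON line."""
    stream.write(json.dumps(event) + "\n")


# An in-memory buffer stands in for the central stream in this sketch.
buffer = io.StringIO()
emit(build_event("Reviewer", "task-code-audit", "llm_call", "success",
                 cost_usd=0.0034, latency_ms=950, tokens_used=2800,
                 confidence=0.98), buffer)
print(buffer.getvalue(), end="")
```

One JSON object per line keeps the stream easy to tail and to aggregate for token burn, latency, and confidence distributions.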

Part H: Fallback & Recovery

When API providers fail, the system must degrade gracefully without crashing. A recovery supervisor monitors stuck loops, restarts crashed agents, and routes calls down fallback provider chains.

Part H: Fallback Cascade
Call Primary Model (e.g. OpenAI)
  ↓ on timeout (10s) or error
Call Fallback 1 (e.g. Gemini)
  ↓ on timeout (15s) or error
Call Fallback 2 (e.g. Anthropic)
  ↓ on timeout (20s) or error
Call Local Fallback (Ollama offline model)
  ↓ all fail
Return graceful execution error.

The Proof: This cascade and the session circuit breaker logic are implemented in Agent-Routing.
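The cascade can be sketched as a loop over an ordered provider list. This is an illustration under stated assumptions, not the Agent-Routing code: each provider is a hypothetical callable that honours the timeout it is given and raises on timeout or error, the stub providers stand in for real API clients, and the 60s local timeout is an assumed value that the cascade above does not specify.

```python
from typing import Callable, NamedTuple


class Provider(NamedTuple):
    name: str
    timeout_s: float
    call: Callable[[str, float], str]  # (prompt, timeout_s) -> response text


def run_cascade(prompt: str, providers: list[Provider]) -> dict:
    """Try each provider in order; return the first success or a graceful error."""
    errors = []
    for provider in providers:
        try:
            text = provider.call(prompt, provider.timeout_s)
            return {"success": True, "provider": provider.name,
                    "output": text, "error": None}
        except Exception as exc:  # timeout or provider error: fall through
            errors.append(f"{provider.name}: {exc}")
    # Every provider failed: report the failure instead of crashing.
    return {"success": False, "provider": None,
            "output": None, "error": "; ".join(errors)}


# Stub providers simulating an outage of all hosted models.
def always_times_out(prompt: str, timeout_s: float) -> str:
    raise TimeoutError(f"no response within {timeout_s}s")


def local_echo(prompt: str, timeout_s: float) -> str:
    return f"[local] {prompt}"


chain = [
    Provider("primary", 10.0, always_times_out),
    Provider("fallback-1", 15.0, always_times_out),
    Provider("fallback-2", 20.0, always_times_out),
    Provider("local-ollama", 60.0, local_echo),  # 60s is an assumed value
]
print(run_cascade("summarize the diff", chain)["provider"])
```

Collecting every provider's error in the final result keeps the failure diagnosable even though the caller never sees an exception.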

Part I: Testing & Validation

Multi-agent templates need extensive unit and integration tests. Instead of relying on live LLM calls, the tests should use simulated model calls and assert that:

  • Input validation catches malformed types and prevents path traversals.
  • JSON parsing functions recover gracefully from truncated markdown blocks.
  • The budget circuit breaker trips instantly when simulated model call costs exceed constraints.
  • The fallback model chain invokes local Ollama services when external networks are simulated as down.
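As an illustration of the budget assertion in the list above, here is a minimal sketch of a circuit breaker driven by simulated call costs, with no live model involved. `BudgetBreaker` and `BudgetExceeded` are hypothetical names, and the $2.00 limit is only an example value.

```python
class BudgetExceeded(RuntimeError):
    """Raised when recorded spend goes over the configured limit."""


class BudgetBreaker:
    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        self.tripped = False

    def record(self, cost_usd: float) -> None:
        """Add one call's cost; trip and raise as soon as the limit is exceeded."""
        if self.tripped:
            raise BudgetExceeded("breaker already tripped")
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            self.tripped = True
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}"
            )


def run_simulated_calls(breaker: BudgetBreaker, costs: list[float]) -> int:
    """Record simulated call costs in order; return how many stayed within budget."""
    completed = 0
    for cost in costs:
        try:
            breaker.record(cost)
        except BudgetExceeded:
            break
        completed += 1
    return completed


breaker = BudgetBreaker(limit_usd=2.00)
print(run_simulated_calls(breaker, [0.75, 0.75, 0.75, 0.75]), breaker.tripped)
```

A test built on this would assert that the breaker trips on the exact call that crosses the limit and that every later call is refused.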

Part J: Minimum Documentation Structure

A production-ready repository must possess a clear memory structure. We enforce a minimal directory template:

.claude/
├── CLAUDE.md                 ← System identity & core rules
├── ARCHITECTURE.md           ← Detailed system design
└── PATTERNS.md               ← Code implementation rules

memory/
├── current-state.md         ← Operator state ledger
├── scars.md                 ← Failure memory incidents
└── reality/
    └── llm-providers.yaml    ← Live API provider mapping
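A small check can confirm that a workspace follows this layout before an agent starts. The sketch below is a hypothetical helper, not part of the template; it only verifies that the files named in the tree exist.

```python
from pathlib import Path

# Files named in the minimal directory template above.
REQUIRED_PATHS = [
    ".claude/CLAUDE.md",
    ".claude/ARCHITECTURE.md",
    ".claude/PATTERNS.md",
    "memory/current-state.md",
    "memory/scars.md",
    "memory/reality/llm-providers.yaml",
]


def missing_paths(root: Path) -> list[str]:
    """Return the required files that do not exist under the given root."""
    return [rel for rel in REQUIRED_PATHS if not (root / rel).is_file()]


missing = missing_paths(Path("."))
if missing:
    print("Missing template files:", ", ".join(missing))
else:
    print("Workspace matches the minimal template.")
```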

Part K: Customization Checklist

Before launching a new agent workspace using the AI-trix template, check off the configuration criteria:

  • Assign a unique system name and namespace to prevent state collisions.
  • Set the hard budget limits (e.g. maximum $2.00 per execution loop).
  • Establish fallback providers and ensure the local Ollama instance can download the models it needs for offline execution.
  • Populate initial subsystem configurations in `memory/reality/` to initialize the reality-first truth engine.
Clone the AI-trix Template on GitHub →